Adjusting method and training system of machine learning classification model and user interface

ABSTRACT

An adjusting method and a training system for a machine learning classification model and a user interface are provided. The machine learning classification model is used to identify several categories. The adjusting method includes the following steps. Several identification data are inputted to the machine learning classification model to obtain several confidences of the categories for each of the identification data. A classification confidence distribution for each of the identification data whose highest value of the confidences is not greater than a critical value is recorded. The classification confidence distributions of the identification data are counted. Some of the identification data are collected according to the cumulative counts of the classification confidence distributions. Whether the collected identification data belong to a new category is determined. If the collected identification data belong to a new category, the new category is added.

This application claims the benefit of Taiwan application Serial No. 109138987, filed Nov. 9, 2020, the disclosure of which is incorporated by reference herein in its entirety.

TECHNICAL FIELD

The disclosure relates in general to an adjusting method and a training system for a machine learning classification model and a user interface.

BACKGROUND

In the object detection or category classification of a machine learning classification model, classification errors or low classification confidence may occur. If the features of the identified object are seldom included in the training data, identification correctness may become too low. Or, if the identification breadth of the machine learning classification model is too narrow and the identified object has never been seen before, the identified object may be wrongly classified to an incorrect category and result in an identification error.

The most commonly used method for resolving the above problems is to increase the size of the original training data. However, this method, despite consuming a large amount of time and labor, yields only a small improvement.

SUMMARY

The disclosure is directed to an adjusting method and a training system for a machine learning classification model and a user interface.

According to one embodiment, an adjusting method for a machine learning classification model is provided. The machine learning classification model is used to identify several categories. The adjusting method includes the following steps. Several identification data are inputted to the machine learning classification model to obtain several confidences of the categories for each of the identification data. A classification confidence distribution for each of the identification data whose highest value of the confidences is not greater than a critical value is recorded. The classification confidence distributions of the identification data are counted. Some of the identification data are collected according to the cumulative counts of the classification confidence distributions. Whether the collected identification data belong to a new category is determined. If the collected identification data belong to a new category, the new category is added.

According to another embodiment, a training system for a machine learning classification model is provided. The machine learning classification model is used to identify several categories. The training system includes an input unit, a machine learning classification model, a recording unit, a statistical unit, a collection unit, a determination unit and a category addition unit. The input unit is configured to input several identification data. The machine learning classification model is configured to obtain several confidences of the categories for each of the identification data. The recording unit is configured to record a classification confidence distribution for each of the identification data whose highest value of the confidences is not greater than a critical value. The statistical unit is configured to count the classification confidence distributions of the identification data. The collection unit is configured to collect some of the identification data according to the cumulative counts of the classification confidence distributions. The determination unit is configured to determine whether the collected identification data belong to a new category. If the collected identification data belong to the new category, the category addition unit adds the new category.

According to an alternative embodiment, a user interface for a user to operate a training system for a machine learning classification model is provided. The machine learning classification model is used to identify several categories. After the machine learning classification model receives several identification data, the machine learning classification model obtains several confidences of the categories for each of the identification data. The user interface includes a recommendation window and a classification confidence distribution window. The recommendation window is configured to show several optimized recommendation data sets. When one of the optimized recommendation data sets is clicked, the classification confidence distribution window shows a classification confidence distribution of the optimized recommendation data set which is clicked.

The above and other aspects of the disclosure will become better understood with regard to the following detailed description of the preferred but non-limiting embodiment(s). The following description is made with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a training system for a machine learning classification model according to an embodiment.

FIG. 2 is a flowchart of an adjusting method for a machine learning classification model according to an embodiment.

FIG. 3 is a schematic diagram of a user interface according to an embodiment.

In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In other instances, well-known structures and devices are schematically shown in order to simplify the drawing.

DETAILED DESCRIPTION

Referring to FIG. 1, a schematic diagram of a training system 1000 for a machine learning classification model 200 according to an embodiment is shown. The machine learning classification model 200 is used to identify several categories. For example, during the semiconductor process, the machine learning classification model 200 identifies “scratch”, “crack” and “circuit” on a wafer image. After a wafer image is inputted to the machine learning classification model 200, several identification values are obtained and listed in Table 1. Since the confidence of the “scratch” category, being the highest among all confidences, is higher than a predetermined value (such as 80%), an identification result of “scratch” is outputted.

TABLE 1

Category    Confidence
Scratch     92%
Crack       5%
Circuit     2%

In another example, after a wafer image is inputted to the machine learning classification model 200, several identification values are obtained and listed in Table 2. Since the confidence of the “crack” category, although the highest among all confidences, is not higher than the predetermined value (such as 80%), no identification result is outputted. Unlike the training data of the machine learning classification model 200, in which cracks always occur at the edge, the present wafer image has cracks at the central position and is unable to produce a high confidence for the “crack” category. The training system 1000 of the present disclosure can generate new data and train the machine learning classification model 200 using the generated data to optimize the identification result.

TABLE 2

Category    Confidence
Scratch     6%
Crack       72%
Circuit     3%

In another example, after a wafer image is inputted to the machine learning classification model 200, several identification values are obtained and listed in Table 3. Although the confidences of the “scratch” category and the “crack” category differ only slightly, neither is higher than the predetermined value (such as 80%), and no identification result can be outputted. The confidence of the “circuit” category is also extremely low. It is possible that the machine learning classification model 200 does not have enough categories (for example, the machine learning classification model 200 should include a “micro-particle” category), so no category can produce a high confidence. The training system 1000 of the present disclosure can add a new category for the identification data and train the machine learning classification model 200 using the new category to optimize the identification result.

TABLE 3

Category    Confidence
Scratch     32%
Crack       35%
Circuit     3%

Refer to FIG. 1. The training system 1000 of the machine learning classification model 200 includes an input unit 110, the machine learning classification model 200, an output unit 120, a recording unit 130, a statistical unit 140, a collection unit 150, a determination unit 160, a category addition unit 170, a feature extraction unit 180, a data generation unit 190 and a user interface 300. The functions of those elements are briefly disclosed below. The input unit 110, such as a transmission line, a transmission module, a hard disc, a memory or a cloud data center, is configured to input data. The output unit 120, such as a transmission line, a transmission module or a display, is configured to output an identification result. The recording unit 130, such as a memory, a hard disc or a cloud data center, is configured to record data. The statistical unit 140 is configured to count data. The collection unit 150 is configured to collect the data. The determination unit 160 is configured to perform a determination process. The category addition unit 170 is configured to add a new category. The feature extraction unit 180 is configured to extract features. The data generation unit 190 is configured to generate data. The statistical unit 140, the collection unit 150, the determination unit 160, the category addition unit 170, the feature extraction unit 180, and the data generation unit 190 can be realized by a circuit, a chip, a circuit board, program code or a storage device storing program code. The user interface 300 can be realized by a display panel of a mobile device.

The training system 1000 can supplementarily train the machine learning classification model 200 using the feature extraction unit 180 and the data generation unit 190 to improve the situation of Table 2. Moreover, the training system 1000 can supplementarily train the machine learning classification model 200 using the category addition unit 170 to improve the situation of Table 3. The operations of the above elements are disclosed below with a flowchart.

Referring to FIG. 2, a flowchart of an adjusting method for the machine learning classification model 200 according to an embodiment is shown. The machine learning classification model 200 is used to identify several categories CG. In step S110, several identification data DT are inputted to the machine learning classification model 200 by the input unit 110 to obtain several confidences CF of the categories CG for each of the identification data DT. One confidence CF of each of the categories CG can be obtained for each of the identification data DT. The category CG with the highest value of the confidences CF represents the most likely category of the identification data DT.

Then, the method proceeds to step S120. For each of the identification data DT, if the highest value of the confidences CF is greater than a critical value (such as 80%), the corresponding category CG is outputted by the output unit 120; if the highest value of the confidences CF is not greater than the critical value, a classification confidence distribution CCD of the confidences CF is recorded by the recording unit 130.
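The decision made in step S120 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `identify`, the dictionary representation of the confidences CF and the use of fractions instead of percentages are assumptions made for the example. The confidence values are taken from Tables 1 to 3.

```python
from typing import Dict, Optional

CRITICAL_VALUE = 0.80  # example critical value from the description (80%)


def identify(confidences: Dict[str, float],
             critical_value: float = CRITICAL_VALUE) -> Optional[str]:
    """Return the most likely category if its confidence is greater than
    the critical value; otherwise return None, meaning that no result is
    outputted and the confidences should be recorded instead."""
    top_category = max(confidences, key=confidences.get)
    if confidences[top_category] > critical_value:
        return top_category
    return None


# Confidences from Tables 1 to 3, written as fractions (hypothetical layout).
table_1 = {"scratch": 0.92, "crack": 0.05, "circuit": 0.02}
table_2 = {"scratch": 0.06, "crack": 0.72, "circuit": 0.03}
table_3 = {"scratch": 0.32, "crack": 0.35, "circuit": 0.03}

for name, table in [("Table 1", table_1), ("Table 2", table_2), ("Table 3", table_3)]:
    print(name, identify(table))
```

Only the data of Table 1 yields an identification result; the data of Tables 2 and 3 fall through to the recording branch.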

Referring to Table 4, a classification confidence distribution CCD for one item of identification data DT is listed. Several confidence intervals, such as 80% to 70%, 70% to 60%, 60% to 50%, 50% to 40%, 40% to 30%, 30% to 20%, 20% to 10%, 10% to 0%, can be pre-determined for each of the categories CG (for example, none of the above confidence intervals includes its upper limit). It should be noted that none of the confidence intervals includes a range greater than the critical value. The classification confidence distribution CCD of Table 4 is a combination of “the scratch category has a confidence interval of 40% to 30%”, “the crack category has a confidence interval of 40% to 30%” and “the circuit category has a confidence interval of 10% to 0%”.

TABLE 4

Category    Confidence    Confidence Interval
Scratch     32%           40% to 30%
Crack       35%           40% to 30%
Circuit     3%            10% to 0%

Referring to Table 5, a classification confidence distribution CCD for another item of identification data DT is listed. The classification confidence distribution CCD of Table 5 is a combination of “the scratch category has a confidence interval of 60% to 50%”, “the crack category has a confidence interval of 40% to 30%” and “the circuit category has a confidence interval of 10% to 0%”. The classification confidence distribution CCD of Table 5 is different from that of Table 4.

TABLE 5

Category    Confidence    Confidence Interval
Scratch     66%           60% to 50%
Crack       39%           40% to 30%
Circuit     9%            10% to 0%

Referring to Table 6, a classification confidence distribution CCD for another item of identification data DT is listed. The classification confidence distribution CCD of Table 6 is a combination of “the scratch category has a confidence interval of 40% to 30%”, “the crack category has a confidence interval of 40% to 30%” and “the circuit category has a confidence interval of 10% to 0%”. The confidences CF of Table 6 are different from those of Table 4, but the classification confidence distribution CCD of Table 6 is identical to that of Table 4.

TABLE 6

Category    Confidence    Confidence Interval
Scratch     31%           40% to 30%
Crack       32%           40% to 30%
Circuit     5%            10% to 0%
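The mapping from confidences CF to a classification confidence distribution CCD can be illustrated with the following sketch. It assumes 10%-wide intervals that exclude their upper limit, as in the example above, and represents a distribution as a sorted tuple of (category, interval) pairs so that two distributions can be compared directly. The function names and this representation are assumptions for illustration only, and the handling of a confidence exactly equal to the critical value is not addressed here.

```python
from typing import Dict, Tuple

INTERVAL_WIDTH = 10  # percent; matches the example intervals (80% to 70%, ...)

Interval = Tuple[int, int]  # (upper limit, lower limit), upper limit excluded


def confidence_interval(confidence_pct: int) -> Interval:
    """Map a confidence in percent to the interval that contains it."""
    lower = (confidence_pct // INTERVAL_WIDTH) * INTERVAL_WIDTH
    return (lower + INTERVAL_WIDTH, lower)


def distribution(confidences_pct: Dict[str, int]) -> Tuple[Tuple[str, Interval], ...]:
    """Build a hashable distribution: one interval per category."""
    return tuple(sorted((category, confidence_interval(value))
                        for category, value in confidences_pct.items()))


# Confidences of Table 4 and Table 6, in percent.
table_4 = {"scratch": 32, "crack": 35, "circuit": 3}
table_6 = {"scratch": 31, "crack": 32, "circuit": 5}

# Different confidences, identical classification confidence distribution.
print(distribution(table_4) == distribution(table_6))
```

Because the distribution is a plain tuple, it can be used directly as a dictionary key when identical distributions are counted.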

As the machine learning classification model 200 continues to identify the identification data DT, more and more classification confidence distributions CCD will be recorded, wherein some of the recorded classification confidence distributions CCD are identical.

Then, the method proceeds to step S130. The classification confidence distributions CCD of the identification data DT are counted by the statistical unit 140. In the present step, the counts of the various classification confidence distributions CCD are accumulated by the statistical unit 140, and the cumulative counts are shown on the user interface 300 for recommendation.

Then, the method proceeds to step S140. Some of the identification data DT are collected by the collection unit 150 according to the cumulative counts of the classification confidence distributions CCD. The collection unit 150 collects the identification data DT corresponding to the classification confidence distribution CCD with the highest cumulative count. For example, if the highest cumulative count of the classification confidence distributions CCD is 13, this implies that there are 13 items of identification data DT corresponding to that classification confidence distribution CCD, and the collection unit 150 collects the 13 items of identification data DT.
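Steps S130 and S140 can be sketched together as a grouping operation. In this minimal example, each recorded item is assumed to be a pair of a data identifier and a hashable classification confidence distribution; the function name, the identifiers and the string labels standing in for distributions are all hypothetical. Ties between distributions with equal counts are resolved in favor of the one recorded first, which is a choice made here and not one stated in the description.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def collect_most_frequent(
    records: Iterable[Tuple[str, Hashable]],
) -> Tuple[Hashable, int, List[str]]:
    """Group data identifiers by distribution and return the distribution
    with the highest cumulative count, that count, and its identifiers."""
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for data_id, ccd in records:
        groups[ccd].append(data_id)
    top_ccd = max(groups, key=lambda ccd: len(groups[ccd]))
    return top_ccd, len(groups[top_ccd]), groups[top_ccd]


# Hypothetical records: "A" and "B" stand for two different distributions.
records = [("img_01", "A"), ("img_02", "B"), ("img_03", "A"), ("img_04", "A")]
top_ccd, count, collected = collect_most_frequent(records)
print(top_ccd, count, collected)
```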

Then, the method proceeds to step S150. Whether the collected identification data DT belong to a new category is determined by the determination unit 160. The new category refers to a category not included in the categories CG defined by the machine learning classification model 200. For example, the determination unit 160 can automatically make the determination using an algorithm, such as a k-means algorithm. Or, the determination unit 160 can receive an inputted message from an operator to confirm whether the identification data DT belong to a new category. If the collected identification data DT belong to a new category (not included in the defined categories CG), the method proceeds to step S160; if the collected identification data DT do not belong to a new category (but belong to one of the defined categories CG), the method proceeds to step S170.

In step S160, a new category, such as a “micro-particle” category CG′, is added by the category addition unit 170.

Then, the method proceeds to step S161. Data are generated for the new category CG′ by the data generation unit 190 to obtain several generated data DT′. The data generation unit 190 generates data using, for example, a generative adversarial network (GAN) algorithm or a domain randomization algorithm. In the present step, data are generated for the new category CG′, such as a dummy “micro-particle” category, to obtain several generated data DT′.

Then, the method proceeds to step S180. The generated data DT′ are inputted to the machine learning classification model 200 with the new category by the input unit 110 to train the machine learning classification model 200. Thus, the features of the machine learning classification model 200 can be modified, such that the modified machine learning classification model 200 can correctly identify the new category CG′.

In an embodiment, the step S161 can be omitted, and the existing identification data DT are directly identified and trained by the machine learning classification model 200 according to the existing category CG and the new category CG′. Thus, the features of the machine learning classification model 200 can be modified, such that the modified machine learning classification model 200 can correctly identify the new category CG′.

In step S170, at least one physical feature PC of the collected identification data DT is extracted by the feature extraction unit 180. All of the collected identification data DT belong to a defined category CG but are not correctly identified. Thus, the training data still have some drawbacks and need to be improved. For example, most of the existing identification data DT are cracks or notches at the edge, but the 13 items of identification data DT collected by the collection unit 150 are cracks at the central position of the wafer and are not correctly classified as the “crack” category CG by the machine learning classification model 200.

Then, the method proceeds to step S171. Data are generated by the data generation unit 190 according to the physical feature PC to obtain several generated data DT′. The generated data DT′ have a similar physical feature PC to enhance the existing identification data DT. For example, the data generation unit 190 can generate some generated data DT′ having cracks at the central position and pre-mark the positions of the cracks.

Then, the method proceeds to step S180. The generated data DT′ are inputted to the machine learning classification model 200 by the input unit 110 to train the machine learning classification model 200. Thus, the features of the machine learning classification model 200 can be modified, such that the modified machine learning classification model 200 can correctly identify the identification data DT whose cracks are at the central position of the wafer.

In step S171, the quantity of the generated data DT′ is relevant to the classification confidence distribution CCD. If the quantity of the generated data DT′ is too large, the correctness of the machine learning classification model 200 may be affected; if the quantity of the generated data DT′ is too small, the correctness cannot be enhanced.

For example, the quantity of the generated data DT′ is negatively relevant to the highest confidence of the classification confidence distribution CCD. That is, to produce a desired effect, the larger the value of the highest confidence, the smaller the required quantity of the generated data DT′; the smaller the value of the highest confidence, the larger the required quantity of the generated data DT′.

In an embodiment, the quantity of the generated data DT′ can be arranged as follows. When the highest confidence is greater than or equal to 60% and is less than 80%, the quantity of the generated data DT′ is 10% of the identification data DT; when the highest confidence is greater than or equal to 40% and is less than 60%, the quantity of the generated data DT′ is 15% of the identification data DT; when the highest confidence is greater than or equal to 20% and is less than 40%, the quantity of the generated data DT′ is 20% of the identification data DT; when the highest confidence is less than 20%, the quantity of the generated data DT′ is 25% of the identification data DT.
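The arrangement above amounts to a step function of the highest confidence, which the following sketch expresses directly. Two points are assumptions made for the example and are not stated in the description: the percentage is applied to the number of items of identification data passed in, and fractional results are rounded up. A highest confidence of 80% or more lies outside the arrangement and is rejected.

```python
def generated_quantity(highest_confidence_pct: int, num_identification_data: int) -> int:
    """Return the quantity of generated data for a given highest confidence
    (in percent) and a given number of items of identification data."""
    if not 0 <= highest_confidence_pct < 80:
        raise ValueError("highest confidence must be in the range [0%, 80%)")
    if highest_confidence_pct >= 60:
        ratio_pct = 10
    elif highest_confidence_pct >= 40:
        ratio_pct = 15
    elif highest_confidence_pct >= 20:
        ratio_pct = 20
    else:
        ratio_pct = 25
    # Integer ceiling division; rounding up is an assumption of this sketch.
    return (num_identification_data * ratio_pct + 99) // 100


print(generated_quantity(72, 100))  # highest confidence of Table 2
print(generated_quantity(35, 100))  # highest confidence of Table 3
```

The lower the highest confidence, the larger the returned quantity, which is the negative relationship described above.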

As described for step S130, the cumulative counts are shown on the user interface 300 for recommendation. An example of the user interface 300 is disclosed below. Referring to FIG. 3, a schematic diagram of a user interface 300 according to an embodiment is shown. The user interface 300 includes a recommendation window W1, a classification confidence distribution window W2, a set addition button B1 and a classification confidence distribution modifying button B2. The recommendation window W1 is configured to show several optimized recommendation data sets S1, S2, S3, etc. Within each of the optimized recommendation data sets S1, S2, S3, the identification data DT have an identical classification confidence distribution CCD. When the user clicks the optimized recommendation data set S1, the classification confidence distribution window W2 will show the classification confidence distribution CCD of the identification data DT of the optimized recommendation data set S1.

The optimized recommendation data sets S1, S2, S3, etc. are sorted in descending order of the cumulative counts of the classification confidence distributions CCD.
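The ordering of the recommendation window W1 can be illustrated with a short sketch. It assumes that each recorded classification confidence distribution is available as a hashable value and uses `collections.Counter` from the Python standard library; the function name and the string labels standing in for distributions are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def rank_recommendation_sets(ccds: Iterable[Hashable]) -> List[Tuple[Hashable, int]]:
    """Return (distribution, cumulative count) pairs in descending order of
    cumulative count, as a list that a recommendation window could display."""
    return Counter(ccds).most_common()


# Hypothetical recorded distributions, one entry per item of identification data.
recorded = ["A", "B", "A", "C", "A", "B"]
for rank, (ccd, count) in enumerate(rank_recommendation_sets(recorded), start=1):
    print(f"S{rank}: distribution {ccd}, cumulative count {count}")
```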

The set addition button B1 is configured to add a user-defined optimized data set S1′. The classification confidence distribution modifying button B2 is configured to modify the classification confidence distribution CCD of the user-defined optimized data set S1′. That is, in addition to the optimized recommendation data sets S1, S2, S3, etc., which are recommended according to the cumulative counts of the classification confidence distributions CCD, the user can define the contents of the classification confidence distribution CCD to generate a user-defined optimized data set S1′ and obtain the corresponding identification data DT.

The user can tick one or more of the optimized recommendation data sets S1, S2, S3, etc. or the user-defined optimized data set S1′ to determine which of the identification data DT are used for subsequent data generation.

According to the above embodiments, the training system 1000 and the adjusting method for the machine learning classification model 200 can supplementarily train the machine learning classification model 200 using the feature extraction unit 180 and the data generation unit 190 to increase the correctness of identification. Moreover, the training system 1000 and the adjusting method can supplementarily train the machine learning classification model 200 using the category addition unit 170 to increase the breadth of identification.

It will be apparent to those skilled in the art that various modifications and variations can be made to the disclosed embodiments. It is intended that the specification and examples be considered as exemplary only, with a true scope of the disclosure being indicated by the following claims and their equivalents.

What is claimed is:
1. An adjusting method for a machine learning classification model, wherein the machine learning classification model is used to identify a plurality of categories, and the adjusting method comprises: inputting a plurality of identification data to the machine learning classification model to obtain a plurality of confidences of the categories for each of the identification data; recording a classification confidence distribution for each of the identification data whose highest value of the confidences is not greater than a critical value; counting the classification confidence distributions of the identification data; collecting some of the identification data according to cumulative counts of the classification confidence distributions; determining whether the collected identification data belong to a new category; and adding the new category if the collected identification data belong to the new category.
2. The adjusting method for the machine learning classification model according to claim 1, wherein after the new category is added, the adjusting method further comprises: inputting the identification data to the machine learning classification model with the new category to train the machine learning classification model.
3. The adjusting method for the machine learning classification model according to claim 1, wherein after the new category is added, the adjusting method further comprises: generating data for the new category to obtain a plurality of generated data; and inputting the generated data to the machine learning classification model with the new category to train the machine learning classification model.
4. The adjusting method for the machine learning classification model according to claim 1, further comprising: extracting at least one physical feature of the collected identification data if the collected identification data do not belong to the new category; generating data to obtain a plurality of generated data according to the at least one physical feature; and inputting the generated data to the machine learning classification model to train the machine learning classification model.
5. The adjusting method for the machine learning classification model according to claim 4, wherein in the step of generating data, quantity of the generated data is relevant to the classification confidence distribution.
6. The adjusting method for the machine learning classification model according to claim 5, wherein in the step of generating data, the quantity of the generated data is negatively relevant to a highest confidence of the classification confidence distribution.
7. The adjusting method for the machine learning classification model according to claim 6, wherein in the step of generating data, when the highest confidence is greater than or equal to 60% and is less than 80%, the quantity of the generated data is 10% of the identification data; when the highest confidence is greater than or equal to 40% and is less than 60%, the quantity of the generated data is 15% of the identification data; when the highest confidence is greater than or equal to 20% and is less than 40%, the quantity of the generated data is 20% of the identification data; when the highest confidence is less than 20%, the quantity of the generated data is 25% of the identification data.
8. The adjusting method for the machine learning classification model according to claim 6, wherein the cumulative counts are shown on a user interface.
9. A training system for a machine learning classification model, wherein the machine learning classification model is used to identify a plurality of categories, and the training system comprises: an input unit configured to input a plurality of identification data; the machine learning classification model configured to obtain a plurality of confidences of the categories for each of the identification data; a recording unit configured to record a classification confidence distribution for each of the identification data whose highest value of the confidences is not greater than a critical value; a statistical unit configured to count the classification confidence distributions of the identification data; a collection unit configured to collect some of the identification data according to cumulative counts of the classification confidence distributions; a determination unit configured to determine whether the collected identification data belong to a new category; and a category addition unit configured to add a new category if the collected identification data belong to the new category.
10. The training system for the machine learning classification model according to claim 9, wherein after the new category is added, the input unit further inputs the identification data to the machine learning classification model with the new category to train the machine learning classification model.
11. The training system for the machine learning classification model according to claim 9, further comprising: a data generation unit configured to generate data to obtain a plurality of generated data after the new category is added; wherein the input unit inputs the generated data to the machine learning classification model with the new category to train the machine learning classification model.
12. The training system for the machine learning classification model according to claim 9, further comprising: a feature extraction unit configured to extract at least one physical feature of the collected identification data if the collected identification data do not belong to the new category; and a data generation unit configured to generate data to obtain a plurality of generated data according to the at least one physical feature; wherein the input unit further inputs the generated data to the machine learning classification model to train the machine learning classification model.
13. The training system for the machine learning classification model according to claim 12, wherein quantity of the generated data is relevant to the classification confidence distribution.
14. The training system for the machine learning classification model according to claim 13, wherein the quantity of the generated data is negatively relevant to a highest confidence of the classification confidence distribution.
15. The training system for the machine learning classification model according to claim 14, wherein when the highest confidence is greater than or equal to 60% and is less than 80%, the quantity of the generated data is 10% of the identification data; when the highest confidence is greater than or equal to 40% and is less than 60%, the quantity of the generated data is 15% of the identification data; when the highest confidence is greater than or equal to 20% and is less than 40%, the quantity of the generated data is 20% of the identification data; when the highest confidence is less than 20%, the quantity of the generated data is 25% of the identification data.
16. The training system for the machine learning classification model according to claim 9, further comprising: a user interface used to show the cumulative counts.
17. A user interface for a user to operate a training system for a machine learning classification model, wherein the machine learning classification model is used to identify a plurality of categories, after the machine learning classification model receives a plurality of identification data, the machine learning classification model obtains a plurality of confidences of the categories for each of the identification data, and the user interface comprises: a recommendation window configured to show a plurality of optimized recommendation data sets; and a classification confidence distribution window, wherein when one of the optimized recommendation data sets is clicked, the classification confidence distribution window shows a classification confidence distribution of the optimized recommendation data set which is clicked.
18. The user interface according to claim 17, further comprising: a set addition button configured to add a user-defined optimized data set.
19. The user interface according to claim 17, further comprising: a classification confidence distribution modifying button used to modify a classification confidence distribution of the user-defined optimized data set.
20. The user interface according to claim 17, wherein the recommendation window is sorted according to cumulative counts of the classification confidence distributions for the optimized recommendation data sets.